Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning

Authors

  • Mingzhen Mo
  • Irwin King
Abstract

With the rapid growth of the Internet, more and more people interact with their friends in online social networks like Facebook. Privacy in online social networks has consequently become a hot and dynamic research topic. Although some privacy-protection strategies have been implemented, they are not stringent enough. Recently, Semi-Supervised Learning (SSL), which has the advantage of utilizing unlabeled data to achieve better performance, has attracted much attention from the web research community. By utilizing a large amount of unlabeled data from websites, SSL can effectively infer hidden or sensitive information on the Internet. Furthermore, graph-based SSL is particularly suitable for modeling real-world objects with graph characteristics, such as online social networks. Thus, we propose a novel Community-based Graph (CG) SSL model that can be applied to exploit security issues in online social networks, and then provide two consistent algorithms satisfying distinct needs. To evaluate the effectiveness of this model, we conduct a series of experiments on a synthetic dataset and two real-world datasets from StudiVZ and Facebook. Experimental results demonstrate that our approach predicts sensitive information of online users more accurately and confidently than previous models.

Similar resources

Learning from Partially Labeled Data: Unsupervised and Semi-supervised Learning on Graphs and Learning with Distribution Shifting

This thesis focuses on two fundamental machine learning problems: unsupervised learning, where no label information is available, and semi-supervised learning, where a small number of labels is given in addition to unlabeled data. These problems arise in many real-world applications, such as Web analysis and bioinformatics, where a large amount of data is available, but no or only a small amoun...

Multilabel user classification using the community structure of online networks

We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a socia...

Detecting Overlapping Communities in Social Networks using Deep Learning

In network analysis, a community is typically regarded as a group of nodes with a high density of edges among themselves and a low density of edges to other parts of the network. Detecting community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...

An Optimized Firefly Algorithm based on Cellular Learning Automata for Community Detection in Social Networks

Community structure is one of the important features of social networks. A community is a subgraph whose nodes have many connections to nodes inside the community and very few connections to nodes outside it. The objective of community detection is to separate groups or communities that are linked more closely. In fact, community detection is the clustering...

Semi-supervised Learning for Convolutional Neural Networks via Online Graph Construction

The recent promising achievements of deep learning rely on large amounts of labeled data. Although data on the web is abundant, most of it has no labels at all. Therefore, it is important to improve generalization performance using unlabeled data on supervised tasks with few labeled instances. In this work, we revisit graph-based semi-supervised learning algorithms and propose...

Publication date: 2010